Asymmetric LOCO Codes: Constrained Codes for Flash Memories
In data storage and data transmission, certain patterns are more likely to be
subject to error when written (transmitted) onto the media. In magnetic
recording systems with binary data and bipolar non-return-to-zero signaling,
patterns that have insufficient separation between consecutive transitions
exacerbate inter-symbol interference. Constrained codes are used to eliminate
such error-prone patterns. A recent example is a new family of
capacity-achieving constrained codes, named lexicographically-ordered
constrained codes (LOCO codes). LOCO codes are symmetric, that is, the set of
forbidden patterns is closed under taking pattern complements. LOCO codes are
suboptimal in terms of rate when used in Flash devices where block erasure is
employed since the complement of an error-prone pattern is not detrimental in
these devices. This paper introduces asymmetric LOCO codes (A-LOCO codes),
which are lexicographically-ordered constrained codes that forbid only those
patterns that are detrimental for Flash performance. A-LOCO codes are also
capacity-achieving, and at finite lengths, they offer higher rates than the
available state-of-the-art constrained codes designed for the same goal. The
mapping-demapping between the index and the codeword in A-LOCO codes allows
low-complexity encoding and decoding algorithms that are simpler than their
LOCO counterparts.
Comment: 9 pages (double column), 0 figures, accepted at the Annual Allerton
Conference on Communication, Control, and Computing
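As an illustration of the index-to-codeword mapping idea mentioned in the abstract above, here is a minimal brute-force sketch. It assumes, purely for illustration, that the only forbidden pattern is `101`; the function names `codebook`, `encode`, and `decode` are hypothetical and not taken from the paper. The paper's low-complexity encoding-decoding rule is not reproduced here: this sketch enumerates all words, which is exponential in the length.

```python
from itertools import product

# Assumed forbidden pattern set, chosen only for illustration.
FORBIDDEN = ("101",)


def codebook(length, forbidden=FORBIDDEN):
    """All binary words of the given length that avoid every forbidden
    pattern, listed in lexicographic order."""
    # product("01", repeat=length) already yields words in lexicographic order.
    words = ("".join(bits) for bits in product("01", repeat=length))
    return [w for w in words if not any(p in w for p in forbidden)]


def encode(index, length):
    """Map an index to the codeword with that lexicographic rank."""
    return codebook(length)[index]


def decode(word, length):
    """Map a codeword back to its lexicographic rank."""
    return codebook(length).index(word)


if __name__ == "__main__":
    book = codebook(4)
    print(len(book), book)
    print(encode(5, 4), decode("1111", 4))
```

Because the codewords are kept in lexicographic order, the index alone identifies the codeword, so no lookup table has to be transmitted; a practical scheme would replace the enumeration with a closed-form rule.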
Protecting the Future of Information: LOCO Coding With Error Detection for DNA Data Storage
DNA strands serve as a storage medium for 4-ary data over the alphabet
{A, T, G, C}. DNA data storage promises formidable information density,
long-term durability, and ease of replicability. However, information in this
intriguing storage technology might be corrupted. Experiments have revealed
that DNA sequences with long homopolymers and/or with low GC-content are
notably more subject to errors upon storage.
This paper investigates the utilization of the recently-introduced method for
designing lexicographically-ordered constrained (LOCO) codes in DNA data
storage. This paper introduces DNA LOCO (D-LOCO) codes, over the alphabet
{A, T, G, C} with limited runs of identical symbols. These codes come with an
encoding-decoding rule we derive, which provides affordable encoding-decoding
algorithms. In terms of storage overhead, the proposed encoding-decoding
algorithms outperform those in the existing literature. Our algorithms are
readily reconfigurable. D-LOCO codes are intrinsically balanced, which allows
us to achieve balancing over the entire DNA strand with minimal rate penalty.
Moreover, we propose four schemes to bridge consecutive codewords, three of
which guarantee single substitution error detection per codeword. We examine
the probability of undetected errors. We also show that D-LOCO codes are
capacity-achieving and that they offer remarkably high rates at moderate
lengths.
Comment: 14 pages (double column), 3 figures, submitted to the IEEE
Transactions on Molecular, Biological and Multi-scale Communications (TMBMC)
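The two error-prone properties named in the abstract above, long homopolymers and unbalanced GC-content, can be checked on a strand with a few lines of code. The following is a minimal sketch, not the paper's method: the helper names and the thresholds (`max_run_allowed=3`, GC-content between 0.4 and 0.6) are assumptions chosen only for illustration.

```python
from itertools import groupby


def max_run(strand):
    """Length of the longest run of identical symbols (homopolymer)."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def gc_content(strand):
    """Fraction of symbols in the strand that are G or C."""
    if not strand:
        return 0.0
    return sum(ch in "GC" for ch in strand) / len(strand)


def satisfies_constraints(strand, max_run_allowed=3, gc_low=0.4, gc_high=0.6):
    """Check the assumed run-length and GC-content thresholds."""
    return (max_run(strand) <= max_run_allowed
            and gc_low <= gc_content(strand) <= gc_high)


if __name__ == "__main__":
    for s in ("ATGCATGC", "AAAATGCG", "ATATATAT"):
        print(s, max_run(s), gc_content(s), satisfies_constraints(s))
```

A checker like this only detects violations after the fact; a constrained code such as the one described in the abstract avoids producing violating sequences in the first place.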
A Combinatorial Methodology for Optimizing Non-Binary Graph-Based Codes: Theoretical Analysis and Applications in Data Storage
Non-binary (NB) low-density parity-check (LDPC) codes are graph-based codes that are increasingly being considered as a powerful error correction tool for modern dense storage devices. Optimizing NB-LDPC codes to overcome their error floor is one of the main code design challenges facing storage engineers upon deploying such codes in practice. Furthermore, the increasing levels of asymmetry incorporated by the channels underlying modern dense storage systems, e.g., multi-level Flash systems, exacerbate the error floor problem by widening the spectrum of problematic objects that contribute to the error floor of an NB-LDPC code. In recent research, the weight consistency matrix (WCM) framework was introduced as an effective combinatorial NB-LDPC code optimization methodology that is suitable for modern Flash memory and magnetic recording (MR) systems. The WCM framework was used to optimize codes for asymmetric Flash channels and MR channels that have intrinsic memory, in addition to canonical symmetric additive white Gaussian noise channels. In this paper, we provide an in-depth theoretical analysis needed to understand and properly apply the WCM framework. We focus on general absorbing sets of type two (GASTs) as the detrimental objects of interest. In particular, we introduce a novel tree representation of a GAST called the unlabeled GAST tree, using which we prove that the WCM framework is optimal in the sense that it operates on the minimum number of matrices, which are the WCMs, to remove a GAST. Then, we enumerate WCMs and demonstrate the significance of the savings achieved by the WCM framework in the number of matrices processed to remove a GAST. Moreover, we provide a linear-algebraic analysis of the null spaces of WCMs associated with a GAST. We derive the minimum number of edge weight changes needed to remove a GAST via its WCMs, along with how to choose these changes. 
Additionally, we propose a new set of problematic objects, namely oscillating sets of type two (OSTs), which contribute to the error floor of NB-LDPC codes with even column weights on asymmetric channels, and we show how to customize the WCM framework to remove OSTs. We also extend the domain of the WCM framework applications by demonstrating its benefits in optimizing column weight 5 codes, codes used over Flash channels with soft information, and spatially-coupled codes. The performance gains achieved via the WCM framework range between 1 and nearly 2.5 orders of magnitude in the error floor region over interesting channels.